AWS ELB
Zhengliang Wang edited at Sat Jun 29 2024
Cloud

Elastic Load Balancing & Auto Scaling Group Section

  • High Availability
    • run the application/system in at least 2 Availability Zones (AZs)
    • vertical scaling: increase instance size
    • horizontal scaling: more instances
      • Auto Scaling Group
      • Load Balancer
    • Scalability: ability to handle a larger load by making the hardware stronger (scale up) or by adding nodes (scale out).
    • Elasticity: once a system is scalable, elasticity means it scales automatically based on the load. Pay per use, match demand, optimize costs.
    • Agility: new IT resources are one click away, which reduces the time to make resources available.

Elastic Load Balancing (ELB)

Forwards internet traffic to multiple downstream servers.

  • spread load across instances
  • expose a single point of access (DNS name) to the system
  • seamlessly handle failure of downstream instances
  • run regular health checks on instances
  • provide SSL termination
  • high availability across zones
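
The first few points above (spreading load, health checks, handling failed instances) can be sketched as a toy round-robin balancer. This is only an illustration of the idea, not how ELB is implemented; the `LoadBalancer` class, its method names, and the instance IDs are hypothetical.

```python
class LoadBalancer:
    """Toy round-robin balancer that skips instances failing a health check."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)  # assume all healthy until checked
        self._next = 0                      # index of the next instance to try

    def run_health_checks(self, is_healthy):
        """Refresh the healthy set using a caller-supplied check function."""
        self.healthy = {i for i in self.instances if is_healthy(i)}

    def route(self):
        """Return the next healthy instance in round-robin order."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next % len(self.instances)]
            self._next += 1
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances")


if __name__ == "__main__":
    lb = LoadBalancer(["i-a", "i-b", "i-c"])
    # Pretend the health check found that i-b is down.
    lb.run_health_checks(lambda instance: instance != "i-b")
    print([lb.route() for _ in range(4)])
```

Clients only ever call `route()`, which mirrors the single point of access: they never need to know which instance is currently down.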

4 kinds of load balancers:

  • Layer 7 - Application Load Balancer (ALB)
    • HTTP / HTTPS / gRPC
    • HTTP routing
    • static DNS name (URL)
  • Layer 4 - Network Load Balancer (NLB)
    • TCP / UDP
    • high performance: millions of requests per second
    • static IP through Elastic IP
  • Layer 3 - Gateway Load Balancer (GWLB)
    • Generic Network Virtualization Encapsulation (GENEVE) protocol on IP packets
    • route traffic to firewalls
    • intrusion detection
  • Layer 4 & 7 - Classic Load Balancer (CLB)
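
As a quick memory aid, the protocol-to-balancer mapping in the list above can be written as a lookup table. The function name `pick_load_balancer` is hypothetical, and the table covers only the protocols listed in these notes.

```python
# Protocol -> load balancer type, taken from the list above.
LOAD_BALANCER_BY_PROTOCOL = {
    "HTTP": "Application Load Balancer",
    "HTTPS": "Application Load Balancer",
    "GRPC": "Application Load Balancer",
    "TCP": "Network Load Balancer",
    "UDP": "Network Load Balancer",
    "GENEVE": "Gateway Load Balancer",
}


def pick_load_balancer(protocol: str) -> str:
    """Return the load balancer type for a protocol (case-insensitive)."""
    key = protocol.strip().upper()
    if key not in LOAD_BALANCER_BY_PROTOCOL:
        raise ValueError(f"protocol not covered by these notes: {protocol!r}")
    return LOAD_BALANCER_BY_PROTOCOL[key]


if __name__ == "__main__":
    for proto in ("gRPC", "udp", "GENEVE"):
        print(proto, "->", pick_load_balancer(proto))
```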

Auto Scaling Group (ASG)

  • goal:

    • scale out to match increasing load
    • scale in to match decreasing load
    • ensure minimum and maximum number of instances running
    • automatically register new instances to a load balancer
    • replace unhealthy instances
  • cost saving: only run at an optimal capacity

  • Strategies

    • Manual scaling
    • dynamic scaling
      • a CloudWatch alarm triggers when CPU > 70%: add instances
      • a CloudWatch alarm triggers when CPU < 30%: remove instances
    • target tracking scaling
      • keep CPU around 40%
    • Scheduled Scaling
    • Predictive Scaling
      • use machine learning to predict future traffic
      • automatically provision the right number of instances
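
The dynamic and target tracking strategies above can be sketched as two small functions that compute a desired instance count and keep it within the group's minimum and maximum. This is a rough sketch under assumptions: adding or removing exactly one instance per alarm, and the proportional rule used for target tracking, are illustrative choices and not AWS's published algorithm. All function and parameter names are hypothetical.

```python
import math


def clamp(value, minimum, maximum):
    """Keep the desired capacity between the ASG minimum and maximum."""
    return max(minimum, min(maximum, value))


def alarm_scaling(current, cpu_percent, min_size, max_size, high=70.0, low=30.0):
    """Dynamic scaling: add one instance above `high`, drop one below `low`."""
    if cpu_percent > high:
        desired = current + 1   # scale out (assumed step of 1)
    elif cpu_percent < low:
        desired = current - 1   # scale in (assumed step of 1)
    else:
        desired = current       # inside the band: do nothing
    return clamp(desired, min_size, max_size)


def target_tracking(current, cpu_percent, min_size, max_size, target=40.0):
    """Target tracking: resize so average CPU moves toward `target`.

    Assumes load is spread evenly, so capacity scales proportionally.
    """
    desired = math.ceil(current * cpu_percent / target)
    return clamp(desired, min_size, max_size)


if __name__ == "__main__":
    print(alarm_scaling(current=4, cpu_percent=85, min_size=2, max_size=10))
    print(target_tracking(current=4, cpu_percent=80, min_size=2, max_size=10))
```

The `clamp` call is what enforces the minimum and maximum number of running instances from the goals list, regardless of which strategy produced the desired count.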